DS 6030 | Fall 2022 | University of Virginia


Intro

Let’s suppose it’s 2010 and you’ve just finished watching a collection of 4 movies on your handy DVD player. You still have some free time left and you decide that you want to spend the remainder of your time watching more movies. You have 3 other DVDs lying around, but you do not know which one will be good to watch - a travesty! In order to resolve this, you call up some of your closest friends and ask them to rate the DVD options that you have. While you’re at it, you ask your friends to also rate the movies you already watched.

Let’s say there are 4 movies that you already watched: - WM1, WM2, WM3, WM4

And let’s say the 3 new movies you’re deciding between are: - NM1, NM2, NM3

You were able to get a response from 3 of your friends: Hyun, Eric, and Chunru. They weren’t able to provide ratings for all 7 movies, but you were still able to summarize their responses in a simple table:

WM1 WM2 WM3 WM4 NM1 NM2 NM3
Hyun 5 3 2 1 4 3
Eric 4 5 2 5
Chunru 3 1 4 5 3

In order to compare, you added your ratings to the table as well. The final table will look like this:

WM1 WM2 WM3 WM4 NM1 NM2 NM3
Hyun 5 3 2 1 4 3
Eric 4 5 2 5
Chunru 3 1 4 5 3
You 4 2 3 5

Given this information, will you be able to decide on which movie to watch next? The answer is yes! Thankfully, it’s 2022 and this concept has been around for a while now. Even more so, this concept is applied in technology through applications that predict which options a user will choose. These applications are called Recommendation Systems.

This tutorial will go through the concept of Recommendation Systems, specifically the technique of Collaborative Filtering. The following will be discussed:

  1. Background of Concepts
  2. Mathematical Theory
  3. Coding Example (R)
  4. Coding Example (Python)

Background of Concepts

Before jumping into the math, it is important to go through some examples of recommendation systems as well as the concept of the Utility Matrix.

Recommendation Systems

We mentioned earlier that recommendation systems are being applied in different technologies nowadays (Ullman). Some of these are:

  1. Movie Recommendations - Like the initial example, recommending movies is one of the most popular examples for recommendation systems. Streaming services such as Netflix, Hulu, Disney, and others all provide recommendations for their users to try.
  2. Product Recommendations - Online shopping is another one of the avenues where recommendation systems are very useful. Users receive suggestions on products to buy based on what similar users are buying.
  3. Song Recommendations - Another example of recommendation systems are with music streaming services where users recveive suggestions for songs that they can add to their playlists.

There are definitely are more examples but these should be able to provide an idea of the vastness of use cases where recommendation systems can be applied. Recommendation systems are helpful in recommending a wider range of items to its users where a physical store would not be able to.

The Utility Matrix

Recall the table we created earlier for the initial example where the different friends had their ratings for different movies. That table is a good representation of the utility matrix:

WM1 WM2 WM3 WM4 NM1 NM2 NM3
Hyun 5 3 2 1 4 3
Eric 4 5 2 5
Chunru 3 1 4 5 3
You 4 2 3 5

Typically, recommendation systems have two types of entities: users and items. The utility matrix represents the degree of preference that a user has for certain items. In our example, the degree of preference is measured on a movie rating scale of 1 to 5. It is also important to note that it is assumed that the utility matrix is sparse, meaning there are blank values in the matrix. This is good for replicating the real-world scenario where not all users would have a rating for all of the items.

The goal of recommendation systems is to fill out the utility matrix by predicting values for the blank ones. Alternatively, recommendation systems can also predict the values for some entries in each row which are expected to have higher values. This would make it so that the recommendation system does not have to fill out all of the empty cells but just a large subset of cells. Applying techniques such as clustering to the original matrix is also something that be considered so that less values need to be predicted (Ullman).

Collaborative Filtering

Recommendation systems can be implemented using different techniques, but for this tutorial we will be focusing on collaborative filtering. This technique focuses more on the similarity of users with each other instead of similarity of features between items. This follows the intuition that similar users will tend to prefer similar items and thus the recommendations are based off of this idea (Ullman). You can also think of this in terms of having neighbors, where neighbors are based on similar item preferences. A user will receive recommendations that are influenced by what their neighbors prefer (Gormley).

Before implementing collaborative filtering recommendation systems, we will first discuss the mathematical theory that works in the background. Specifically, we will discuss how collaborative filtering can be done through cosine similarity or matrix decomposition

Mathematical Theory

Cosine Distance

To measure the similarity of users from the utility matrix, we can use the Cosine Distance method, which calculates the angle between the ratings of different users. In this method, we treat the blank entries as zero. One thing to note about this method is that it latently treats the lack of a rating as disliking the movie (i.e. rating = 0) (Ullman).

[ADD GENERALIZED FORMULA?]

Using our example, we can check the similarity of some of your friends. To get the cosine angle between Hyun and Eric, we can use this equation:

\(\frac{(2\times5) + (4\times2) + (3\times5)}{\sqrt{5^2+3^2+2^2+1^2+4^2+3^2}\sqrt{4^2+5^2+2^2+5^2}} = 0.493\)

We can also check for the cosine angle between Hyun and Chunru:

\(\frac{(5\times3) + (2\times4) + (1\times5) + (3\times3)}{\sqrt{5^2+3^2+2^2+1^2+4^2+3^2}\sqrt{3^2+1^2+4^2+5^2+3^2}} = 0.597\)

Since a larger cosine value implies a smaller angle and therefore a smaller distance, this measure tells us that Hyun is closer to Chunru than to Eric.

In addition, we can normalize ratings by subtracting the average rating of each user. In this case, the movies with low ratings would become negative and those with high ratings would be positive. Users with contrasting ratings on movies would have consine vectors in opposite direction, while users with similar ratings would have smaller distances between them (Ullman).

In our example, the utility matrix after normalizing would be:

WM1 WM2 WM3 WM4 NM1 NM2 NM3
Hyun 2 0 -1 -2 1 0
Eric 0 1 -2 1
Chunru -0.2 -2.2 0.8 1.8 -0.2
You 0.5 -1.5 -0.5 1.5

Then the cosine of the angle between Hyun and Eric is:

\(\frac{(-1\times1) + (1\times-2)}{\sqrt{2^2+(-1^2)+(-2^2)+1^2}\sqrt{1^2+(-2^2)+1^2}} = -0.387\)

The cosine of the angle between Hyun and Chunru is:

\(\frac{(2\times-0.2) + (-1\times0.8) + (-2\times1.8)}{\sqrt{2^2+(-1^2)+(-2^2)+1^2}\sqrt{(-0.2^2)+(-2.2^2)+0.8^2+1.8^2+(-0.2^2)}} = -0.512\)

Still we can see that Hyun is closer to Chunru than to Eric.

UV (Matrix) Decomposition (Need to Paraphrase)

[PARAPHRASE] An approach to estimating the blank entries in the utility matrix is to conjecture that the utility matrix is actually the product of two long, thin matrices.

For example, if we start with the utility matrix M, with n rows and m columns (i.e., here are 4 users and 7 movies, so martix M is \(4 \times 7\)), then we might be able to find a \(n \times d\) matrix U and a \(d \times m\) matrix V, such that the product matrxi UV closely approximates M in those entries where M is nonblank. After that, we can then use the entries in the product UV to estimate the corresponding blank entries in utility matrix M. This process is called UV-decomposition of M. [PARAPHRASE]

In our example, we decompose the utility matrix M (\(4 \times 7\)), into two matrix U (\(4 \times 2\)) and V (\(2 \times 7\))

\[ \begin{bmatrix} 5 & & 3 & 2 & 1 & 4 & 3 \\ & 4 & & 5 & & 2 & 5 \\ 3 & 1 & & 4 & 5 & & 3 \\ 4 & 2 & 3 & 5 & & & \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} \\ u_{21} & u_{22} \\ u_{31} & u_{32} \\ u_{41} & u_{42} \end{bmatrix} \times \begin{bmatrix} v_{11} & v_{12} & v_{13} & v_{14} & v_{15} & v_{16} & v_{17} \\ v_{21} & v_{22} & v_{23} & v_{24} & v_{25} & v_{26} & v_{27} \end{bmatrix} \] Theoratically, we can find many candidates of matrix U and V, whose product UV is close to V. So we need a measurement to pick the one that is closest to M. The typical choice is called the root-mean-square error (RMSE), where we sum over the square root of the difference of nonblank entries from the martix M and the corresponding entries from product matrix UV.

To find the UV-decomposition matrix with the least RMSE, we start with an arbitrary guess of matrix U and V, and the repeatedly adjusting one entry in U or V to make the RMSE smaller. Here we use our example to show the process of UV-decomposition.

Step 1 : Start with matrices U and V with all entries 1:

\[ \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ \end{bmatrix} \]

Step 2 : Alter u11 to reduce the RMSE as much as possible:

\[ \begin{bmatrix} x & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} x+1 & x+1 & x+1 & x+1 & x+1 & x+1 & x+1 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ \end{bmatrix} \]

The contribution to the sum of squares from the first row is:

\[ (5-(x+1))^2 + (3-(x+1))^2 + (2-(x+1))^2 + (1-(x+1))^2 + (4-(x+1))^2 + (3-(x+1))^2 \\ = (4-x)^2 + (2-x)^2 + (1-x)^2 + (-x)^2 + (3-x)^2 + (2-x)^2 \]

and we want a value of x that minimizes the sum, so we take the derivative and set that equal to 0, as: \[ -2((4-x)+ (2-x)+ (1-x)+ (-x) + (3-x) + (2-x)) = 0 \] from which we get x = 2.

Now update the U and V martix with the enrty that \(u_{11}=2\).

\[ \begin{bmatrix} 2 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 3 & 3 & 3 & 3 & 3 & 3 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 & 2 & 2 \\ \end{bmatrix} \]

Step 3 : Alter \(v_{11}\) to reduce the RMSE as much as possible and update \(v_{11}\)

Step 4 : Iterate the steps above again and again until the RMSE.

[CHECK UV PART IF NEED TO PARAPHRASE]

Now that we’ve discussed the math behind this, let’s move on to some coding!

Coding Example (R)

To apply the concepts we’ve learned so far, let us implement these concepts in R. R has a built-in library recommenderlab, which we will go to later. Before going that, let’s first implement a basic cosine similarity solution manually in order to illustrate the point.

#library(R6030)     # functions for DS 6030
library(tidyverse) # functions for data manipulation   
library(recommenderlab) # function for remmondation systems

We will continue to use the initial movie rating example. Let’s figure out which movie you should watch out of the three new movies.

movie_ratings <- data.frame(
  WM1 = c(5, NA, 3, 4), 
  WM2 = c(NA, 4, 1, 2),
  WM3 = c(3, NA, NA, 3),
  WM4 = c(2, 5, 4, 5),
  NM1 = c(1, NA, 5, NA),
  NM2 = c(4, 2, NA, NA),
  NM3 = c(3, 5, 3, NA)
)
rownames(movie_ratings) <- c('Hyun', 'Eric', 'Chunru', 'You')
movie_ratings

We will need a function that computes cosine similarity as it was discussed earlier in this tutorial.

# The two vectors must have the same length
cosine_similarity <- function(vec_1, vec_2) {
  vec_len <- length(vec_1)
  
  # NA values are replaced with 0
  vec_1[is.na(vec_1)] <- 0
  vec_2[is.na(vec_2)] <- 0
  
  # Computing the denominator
  vec_1_denom <- sqrt(sum(vec_1^2))
  vec_2_denom <- sqrt(sum(vec_2^2))
  denominator <- vec_1_denom * vec_2_denom
  
  # Computing the numerator
  tib = tibble(vec_1 = vec_1, vec_2 = vec_2)
  tib <- tib %>% mutate(products = vec_1 * vec_2)
  numerator <- sum(tib$products)
  
  # Return the cosine similarity
  return (numerator / denominator)
}

Let’s check how similar your taste in movies are to the 3 friends who responded. Remember, the higher cosine similarity value means more similarity.

# Extract each person's ratings as vectors
you <- as.numeric(as.vector(movie_ratings['You',]))
hyun <- as.numeric(as.vector(movie_ratings['Hyun',]))
eric <- as.numeric(as.vector(movie_ratings['Eric',]))
chunru <- as.numeric(as.vector(movie_ratings['Chunru',]))


# Compute distance using cosine similarity
similarities <- data.frame(
  cosine_similarity = c(cosine_similarity(you, hyun), cosine_similarity(you, eric), cosine_similarity(you, chunru))
)
rownames(similarities) <- c('Hyun', 'Eric', 'Chunru')
similarities

Based on these results, we can see that your taste in movies is most similar to Hyun’s. We will use these similarity scores as weights for predicting your movie ratings for the three new movies.

# Function for computing the weighted average
movie_rating_weighted_average <- function(movie, friends) {
  denominator <- 0
  numerator <- 0
  for (friend in friends) {
    friend_similarity <- similarities[friend,][1]
    friend_rating <- movie_ratings[friend, movie][1]
    
    # Weighted average will take into account users who actually rated the movie
    if (is.na(friend_rating)) next
    
    denominator <- denominator + friend_similarity
    numerator <- numerator + (friend_similarity * friend_rating)
  }
  
  return (numerator / denominator)
}
friend_names <- c('Hyun', 'Eric', 'Chunru')
new_movies <- c('NM1', 'NM2', 'NM3')
new_movie_predicted_ratings <- tibble()
for (n in new_movies) {
  predicted_rating <- movie_rating_weighted_average(n, friend_names)
  prediction_tibble <- tibble(movie = n, predicted_rating = predicted_rating)
  new_movie_predicted_ratings <- bind_rows(new_movie_predicted_ratings, prediction_tibble)
}
new_movie_predicted_ratings

Based on these predictions, it looks like you will probably like NM3 the most out of all the options.

Now that we were able to get a feel of how collaborative filtering works through cosine similarity, we will now discuss how to make recommendations using the recommenderlab library as Topor explains in his article.

We will start by converting our dataframe to a realRatingMatrix object:

# convert the movie ratings data frame to a matrix
rmat <- as.matrix(movie_ratings)

# convert matrix to a recommenderlab realRatingMatrix
rmat <- as(rmat, "realRatingMatrix")

We can then proceed to creating our Recommender models. The library contains the options we discussed earlier in this tutorial. Hahsler discusses this in the package documentation.

The type UBCF stands for User-Based Collaborative Filtering which we can use to apply cosine similarity and normalization. The implementation of recommenderlab also makes use of a k-nearest neighbors algorithm when building this type of recommender as seen in the package source code.

The type SVDF stands for Funk Singular Value Decomposition which implements the UV decomposition technique.

We will create three versions:

  1. Recommender using Cosine Similarity
  2. Recommender using Cosine Similarity with Normalized values
  3. Recommender using UV decomposition
set.seed(6030)
# Non-normalized cosine
reco <- Recommender(rmat, "UBCF", 
      param=list(normalize = NULL, method="Cosine"))

# Normalized centered cosine
reco_centered <- Recommender(rmat, "UBCF", 
      param=list(normalize = "center", method="Cosine"))

# UV decomposition
# The parameter k is the missing dimension for the decomposed matrices. 
# For this library, it cannot be greater than the number of users or the number of items
reco_uv <- Recommender(rmat, "SVDF", 
      param=list(k = 4))

Now we can check the results:

predictions <- predict(reco, rmat, type="ratings")
print('Non-normalized Cosine Similarity')
#> [1] "Non-normalized Cosine Similarity"
predictions@data
#> 4 x 7 sparse Matrix of class "dgCMatrix"
#>          WM1   WM2 WM3 WM4   NM1   NM2  NM3
#> Hyun   .     2.345   .   . .     .     .   
#> Eric   3.981 .       3   . 3.057 .     .   
#> Chunru .     .       3   . .     2.954 .   
#> You    .     .       .   . 3.051 2.984 3.67
predictions_centered <- predict(reco_centered, rmat, type="ratings")
print('Normalized Cosine Similarity')
#> [1] "Normalized Cosine Similarity"
predictions_centered@data
#> 4 x 7 sparse Matrix of class "dgCMatrix"
#>          WM1   WM2   WM3 WM4   NM1   NM2   NM3
#> Hyun   .     1.553 .       . .     .     .    
#> Eric   4.293 .     3.536   . 5.413 .     .    
#> Chunru .     .     2.748   . .     1.627 .    
#> You    .     .     .       . 4.093 2.509 3.801
predictions_uv <- predict(reco_uv, rmat, type="ratings")
print('UV Decomposition')
#> [1] "UV Decomposition"
predictions_uv@data
#> 4 x 7 sparse Matrix of class "dgCMatrix"
#>          WM1   WM2   WM3 WM4   NM1   NM2   NM3
#> Hyun   .     3.008 .       . .     .     .    
#> Eric   4.065 .     4.040   . 4.044 .     .    
#> Chunru .     .     3.246   . .     3.240 .    
#> You    .     .     .       . 3.552 3.542 3.561

As we can see from the results, NM3 has the highest predicted rating for 2 out of the 3 models. This just shows how recommender systems can be built using different distance measures and algorithms and may produce varying results.

The recommenderlab library also provides ways to measure models against each other. To show this, we will need a bigger dataset. We will use a larger dataset of movie ratings from Netflix. We did some pre-processing on the data to construct it into a utility matrix format. To check that function, you can check here and you can also download the csv file here.

netflix <- read.csv('unscaled_title_data.csv')
head(netflix)

We will then follow similar steps as earlier, by converting the dataframe into a matrix.

# For this example, we will remove the customer id column
netflix_small <- netflix[, 2:51]

# Filter to users who have made atleast 10 movie ratings
netflix_small <- netflix_small[rowSums(is.na(netflix_small))<40,]

# convert the netflix data frame to a matrix
netflix_rmat <- as.matrix(netflix_small)

# convert matrix to a recommenderlab realRatingMatrix
netflix_rmat <- as(netflix_rmat, "realRatingMatrix")

We can then split the data using the evaluationScheme function. We can also set a goodRating value. Since the ratings are on a scale of 1-5, we will use 3 for this example.

set.seed(6030)

# split the data into the training and the test set:
split_data <- evaluationScheme(
  netflix_rmat, 
  method="split", 
  train=0.8, 
  k=1, 
  given=10, 
  goodRating=3
)

We can then build our models using the training data:

set.seed(6030)
# Non-normalized cosine
netflix_reco <- Recommender(getData(split_data, "train"), "UBCF", 
      param=list(normalize = NULL, method="Cosine"))

# Normalized centered cosine
netflix_reco_centered <- Recommender(getData(split_data, "train"), "UBCF", 
      param=list(normalize = "center", method="Cosine"))

# UV decomposition
netflix_reco_uv <- Recommender(getData(split_data, "train"), "SVDF", 
      param=list(k = 4))

The calcPredictioAccuracy function calculates the Root Mean Square Error (RMSE), Mean Squared Error (MSE), and Mean Absolute Error (MAE)s for each of the models. This follows an approach suggested in the recommenderlab package, where we first use the known portion of the test set to make predictions and then calculate the error between those predictions and the test data’s unknown portion (Topor).

netflix_predictions <- predict(netflix_reco, getData(split_data, "known"), type="ratings")
netflix_predictions_centered <- predict(netflix_reco_centered, getData(split_data, "known"), type="ratings")
netflix_predictions_uv <- predict(netflix_reco_uv, getData(split_data, "known"), type="ratings")

error_results <- rbind(
  cosine = calcPredictionAccuracy(netflix_predictions, getData(split_data, "unknown")),
  cosine_normalized = calcPredictionAccuracy(netflix_predictions_centered, getData(split_data, "unknown")),
  uv = calcPredictionAccuracy(netflix_predictions_uv, getData(split_data, "unknown"))
)
error_results
#>                     RMSE    MSE    MAE
#> cosine            0.9842 0.9686 0.7772
#> cosine_normalized 0.9197 0.8458 0.7072
#> uv                0.9241 0.8540 0.7020

From these results, we can see that the Cosine Normalized model produced the lowest RMSE and MSE values for the Netflix data.

We discussed a few ways to create and test recommendation models in this section, but there are many more for you to explore. If you want to explore other methods, Topor discusses a lot of interesting variations in his article. The recommenderlab documentation and code repository is also very detailed.

In addition, we will discuss a simple example coded in Python as well to show the steps for a bigger example.

Coding Example (Python)

We will be using the same Netflix data we used earlier. The full notebook for this example can be found here.

For this Python example, we will be using the cosine_similarity function from the sklearn library. We will implement collaborative filtering using cosine similarity and normalizing the utility matrix by subtracting the average rating of each user.

First, we have to import the necessary libraries for this example:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.metrics.pairwise import cosine_similarity

We will start by reading in the data as before:

rating_df = pd.read_csv('unscaled_title_data.csv')
rating_df.head()

Rating Dataframe

We will then normalize the ratings using the technique we discussed earlier, by subtracting each user rating with the average rating of that user.

scaled_df = rating_df.copy().iloc[:,1:]
averages = scaled_df.mean(axis=1).values
for i in range(len(rating_df)):
    # Subtract by mean of each user, not the entire user
    scaled_df.iloc[i,:] = (scaled_df.iloc[i,:] - averages[i]) 

scaled_df = scaled_df.fillna(0) # fillna(0) should come after scaling
scaled_df.insert(loc=0, column='Cust_Id', value=rating_df["Cust_Id"])
scaled_df

Scaled Dataframe

Now we will create functions for finding similarities for users:

def indices(lst, item):
    return [i for i, x in enumerate(lst) if x == item]
def return_similarity(current_rating, idx):
    similarity_list = []
    for i in range(len(scaled_df)):
        if i == idx:
            continue
        another_rating = np.array([scaled_df.iloc[i,1:]]) # rating of another user
        curr_similarity = round(cosine_similarity(current_rating, another_rating)[0][0], 3)
        similarity = {
            'user_idx': i,
            'cosine_similarity': curr_similarity
        }
        similarity_list.append(similarity)

    return similarity_list

We also have a function for displaying our results later on:

def plot_rating(expected_dict, idx):
    title = expected_dict.keys()
    rating = expected_dict.values()
    recommended = []
    for score in rating:
        if score > 0:
            recommended.append(True)
        else:
            recommended.append(False)
            
    df = pd.DataFrame({
        "Title": title, 
        "Expected Rating": rating, 
        "Recommended": recommended
    }).sort_values(by = "Expected Rating", ascending = False)
    plt.figure(figsize=(20, 7))
    plt.xticks(rotation=90)
    plt.title("Predicted Rating for Unwatched Movies and Recommendation")
    
    # return the plot and a dataframe of the top 10 recommended movies
    return (sns.barplot(x = "Title", y = "Expected Rating", hue = "Recommended", data = df), df.iloc[:10,:])

Get the list of movie titles:

movie_titles = scaled_df.columns[1:]

This is the main function. We will use weighted averages to recommend movies for a user. The weights used are the similarity scores between users. Only users who have rated the movie in question will be included in the weighted average.

This function will be making predictions on the movies that a user has not watched yet. Since our ratings our normalized, positive values would mean better ratings than the negative ones.

def get_similarity(idx):
    current_rating = np.array([scaled_df.iloc[idx,1:]]) # rating of the current user
  
    # BASE CASE: if the user has watched all movies, then returns nothing.
    if 0 not in current_rating: 
        print("This user has watched all the movies.")
        return -1

    similarity_list = return_similarity(current_rating, idx)

    ratings = current_rating[0].tolist()
    expected_dict = dict()
    for movie in movie_titles:
        rating = scaled_df[movie].iloc[idx]
        
        # skip if the user has already seen a movie
        if rating != 0: 
            continue

        numerator = 0
        denominator = 0        
        for similarity in similarity_list:
            other_idx = similarity['user_idx']
            other_similarity = similarity['cosine_similarity']
            other_user_rating = round(scaled_df[movie].iloc[other_idx], 3)
            
            # we do not consider the similarity of unwatched user 
            # because it only affects the denominator but not numerator
            if other_user_rating == 0: 
                continue

            # numerator is iteratively adding up each rating x each similarity score
            numerator = numerator + (other_user_rating * other_similarity)
            denominator += other_similarity

        if denominator == 0: # avoid zero division error
            expected_rating = 0
        else:
            expected_rating = round(numerator / denominator, 2)

        expected_dict[movie] = expected_rating

    return plot_rating(expected_dict, idx)

Let’s put these functions to work and make recommendations for some users!

# user at index 1
res1 = get_similarity(1)

User at Index 1 Results

# top movies recommended for user at index 1
res1[1]

User at Index 1 Movie Rankings User at Index 1 will probably not enjoy the new movies recommended to him, but out of all the movies, The Patriot is at the top of the list.

# user at index 160
res2 = get_similarity(160)

User at Index 160 Results

# top movies recommended for user at index 160
res2[1]

User at Index 160 Movie Rankings User at Index 160 will also probably not enjoy the new movies recommended to him, but out of all the movies, The Patriot is also at the top of his list.

# user at index 200
res3 = get_similarity(200)

User at Index 200 Results

# top movies recommended for user at index 200
res3[1]

User at Index 200 Movie Rankings User at Index 200 will probably enjoy the top two new movies recommended to him, which are Spider-Man 2 and Mystic River.

# user at index 440
res4 = get_similarity(440)

User at Index 440 Results

# top movies recommended for user at index 440
res4[1]

User at Index 440 Movie Rankings User at Index 440 will probably enjoy the top new movie recommended to him, which is The Last Samurai.

# user at index 498
res5 = get_similarity(498)

User at Index 498 Results

# top movies recommended for user at index 498
res5[1]

User at Index 498 Movie Rankings User at Index 498 will also probably enjoy the top two new movies recommended to him, which are The Last Samurai and Mystic River.

Using cosine similarity, we were able to create a simple Collaborative Filtering Recommendation model. Go ahead and try out different techniques as well!

Conclusion

References

Gormley, M., (2017, April 19). Lecture: Matrix Factorization and Collaborative Filtering. Machine Learning Department, Carnegie Mellon University. Retrieved December 13, 2022, from https://www.cs.cmu.edu/~mgormley/courses/10601-s17/slides/lecture25-mf.pdf

Hahsler, M., (2022, August 17). recommenderlab: Lab for Developing and Testing Recommender Algorithms. Cran. Retrieved December 13, 2022, from https://cran.r-project.org/web/packages/recommenderlab/

Hahsler, M., (2022, August 17). R package recommenderlab - Lab for Developing and Testing Recommender Algorithms. Github. Retrieved December 13, 2022, from https://github.com/cran/recommenderlab

Netflix, (2019, November 13). Netflix Prize data. Kaggle. Retrieved December 7, 2022, from https://www.kaggle.com/datasets/netflix-inc/netflix-prize-data

Shummon Maas, L., (2019, June 11). Recommendation Systems using UV-Decomposition. Towards Data Science. Retrieved December 13, 2022, from https://towardsdatascience.com/recommendation-systems-using-uv-decomposition-a1d4116be4a1

Topor, J., (2017, June 11). User-Based and Item-Based Collaborative Filtering. RPubs. Retrieved December 13, 2022, from https://rpubs.com/jt_rpubs/285729

Ullman, J., Leskovec, J., Rajaraman, A., (2012, July 4). Mining Massive Datasets. Cambridge University Press. Retrieved December 7, 2022, from http://infolab.stanford.edu/~ullman/mmds/book.pdf